SIPHER Products


SIPHER’s Qualitative Products

SIPHER’s Data Products

SIPHER’s Quantitative Products



Compare All Qualitative Products

Products compared: Employment and Health Evidence and Gap Map; Causal Loop Diagrams

Status
Employment and Health Evidence and Gap Map: Ready
Causal Loop Diagrams: Ready

Purpose
Employment and Health Evidence and Gap Map: A visual and interactive resource to locate published systematic reviews on the topic of health and work.
Causal Loop Diagrams: A visual and concise representation of SIPHER’s policy areas of interest including for example inclusive growth and housing.

Strengths
Employment and Health Evidence and Gap Map: The primary strength lies in the simplification of complex and diverse research findings. The interactive map contains studies that have explored the relationship between an employment feature and a health and social outcome. The map only contains systematic reviews.
Causal Loop Diagrams: Causal loop diagrams are qualitative tools. They capture the structure of the system concisely for a given topic area, which supports the understanding of a complex system structure. SIPHER has developed causal loop diagrams for two main policy areas: inclusive growth and housing. Their strength lies in bringing together information that is typically part of a linear document but spread across multiple sources (such as academic literature) and presenting it visually, which better reflects the underlying complexity. The causal loop diagrams can be used alone or developed into a quantitative model. These diagrams can capture variables that are difficult to quantify and may not be part of a quantitative model, but are still important to consider. WS2 is developing an interactive causal loop diagram that allows users to locate evidence that explores the links between different nodes (variables) on the causal loop diagram, including feedback loops and time delays.

Limitations
Employment and Health Evidence and Gap Map: Does not provide any analysis on the studies identified. Users may not have access to all academic papers that are captured in the evidence and gap map as some of the covered material is not open access.
Causal Loop Diagrams: Complex and comprehensive causal loop diagrams can seem overwhelming and may not be easily useable in policy or modelling. In contrast, simplified causal loop diagrams may appear more useable but may not capture all relevant variables. Without quantifying the causal loop diagram, understanding the behaviour of the system can be challenging or even impossible. Since the starting point for developing these maps is often the policy partners’ understanding of the policy systems they are trying to influence, their conceptualisations may differ from causal loop diagrams that have a different starting point. This highlights the fact that causal loop diagrams are often purpose-built.

Variables
Employment and Health Evidence and Gap Map: A range of work related characteristics (including contract conditions, employer attributes and working environment) and health related measures (including physical health outcomes and psychological health).
Causal Loop Diagrams: Multiple causal loop diagrams have been developed for different policy areas and variables are dependent on the maps.

Examples / Link with Other Models and Data
Employment and Health Evidence and Gap Map: Informs model building and interpretation of quantitative findings other workstreams have obtained.
Causal Loop Diagrams: Causal loop diagrams are used as a basis to inform the building of the dynamical systems models, as well as the interpretation of findings obtained by other workstreams.

Links
Employment and Health Evidence and Gap Map: Further information about the gap map and link to the interactive tool: https://sipher.ac.uk/employment-health-egm/
Causal Loop Diagrams: https://kumu.io/Sipher-Consortium




Compare All Data Products

Characteristic Synthetic Population (Digital Twin) Health Indicator Data Set Inclusive Economy Indicator Data Set SIPHER-7 Wellbeing Domain Preferences (Survey Data Set) Aversion to Inequality (Survey Data Set) Self-Reported Health and Wellbeing Outcomes (Survey Data Set)
Status Ready Ready Ready In Progress/Ready Soon In Progress/Ready Soon In Progress/Ready Soon
Purpose A data source that contains attribute-rich data at the individual level, with the aim to create a digital twin for every person in the population with a large amount of associated information about each person. A variety of health and mortality indicators for small geographical units (local authorities and LSOAs/MSOAs) for use in statistical analyses and monitoring of area-level health inequalities. SIPHER’s inclusive economy indicators are designed for use in a wide range of statistical and computational models. To represent a multi-dimensional measure of wellbeing, consisting of seven indicators, in terms of a single index metric, equivalent income. To elicit public preferences regarding trade-offs between improving wellbeing and reducing inequality. A dataset with a battery of self-reporting health and wellbeing indicators from a large UK sample, oversampling from Scotland.
Context Individual-level data enables us to understand individuals’ situations and what happens to them over time or when they are affected by changes due to external events or policies. The lack of a comprehensive register-based system in Great Britain has made it challenging to access data on individuals across multiple domains. The synthetic population helps bridge this gap by providing a representative, attribute-rich dataset reflecting the whole of the GB population that can be used for micro-level modelling. By randomly selecting individuals from a survey and assigning them to small geographical areas based on census statistics, the synthetic population ensures that the distribution of demographic characteristics for all sampled individuals corresponds exactly to the true demographic structure of each area. This enables researchers to simulate policy interventions and explore their potential impact on individuals and households in a granular way. Modelling the impact of public policy on health requires a shared understanding of how we conceptualise and measure health as an outcome. We need a set of health indicators that are meaningful in the context of understanding the effects of policies and interventions of interest to SIPHER, such as those aiming to create an inclusive economy or improve mental health. These indicators are derived from non-synthetic data sources, including administrative data from ONS/NRS, as well as surveys and third-sector data. There are multiple approaches and definitions of what constitutes an inclusive economy. SIPHER has adopted a particular understanding which focuses on economic inclusion, rather than inclusive growth. This approach does not include wider outcomes concerned with health, environment, and wellbeing. The data covers local authorities in the UK, although some measures are only available separately for Scotland and England. 
The indicator data set can be used to produce a descriptive overview of how local authorities perform across a range of dimensions. In addition, this harmonised time series can be used to monitor changes descriptively over time, between areas and UK nations, or as input data in statistical models. SIPHER’s WS6 team has developed a wellbeing indicator set comprising seven indicators - SIPHER-7. While SIPHER-7 describes people’s wellbeing across these seven indicators, when some indicators improve and others worsen, it is difficult to judge whether overall wellbeing is improving or worsening. The purpose of this part of the project is to collapse the multi-dimensional wellbeing indicators into a single index metric for wellbeing, equivalent income. To do this, four surveys using Discrete Choice Experiments (DCE) were conducted with a sample of the UK public. Participants were asked to review a set of ten choice tasks, each involving two imaginary scenarios described in terms of SIPHER-7, and select which scenario they believed was better. In three of the surveys, participants were asked to complete the tasks from a personal perspective (i.e., which scenario they would want for themselves), and in the remaining survey, participants were asked to complete the task from a social perspective (i.e., which scenario they think would be better for policy makers to bring about for others). The econometrically estimated parameters represent the relative values given to the seven wellbeing indicators of SIPHER-7 by samples of the UK general public. Public policies aim to improve wellbeing and reduce wellbeing inequality, but it is not always possible to do both. How do the public balance the trade-off between improving wellbeing and reducing inequality? The relative importance people place on increasing averages and reducing inequalities (or “inequality aversion”) was elicited from a sample of the UK general public (n=53). 
Respondents participated in one of eleven online discussion groups, where a series of quantitative trade-off exercises were explained and discussed. Each respondent then completed the same exercise individually. The exercises covered aversion to inequality in: (a) an overall measure of wellbeing (equivalent income); (b) lifetime health across otherwise equal individuals; and (c) lifetime health across the rich and poor. Different surveys use different health outcome indicators. Therefore, data might be available for one indicator set when another is required. For example, answers to SF-12 survey items are available but a WEMWBS value is required. This is a large cross-sectional online survey of the general public (n=12,401) where respondents are asked to self-report their health and wellbeing across a battery of questions. This dataset allows the estimation of a statistical mapping algorithm between the different indicator sets.
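The allocation step described for the Synthetic Population (drawing survey respondents and assigning them to small areas so that census totals are matched) can be illustrated with a deliberately simplified sketch. This is not the Flexible Modelling Framework's algorithm: it is a plain random draw with replacement within each demographic cell, and the field names (`id`, `age_band`, `sex`), area labels, and counts are all hypothetical.

```python
import random
from collections import Counter


def build_synthetic_population(survey, census_counts, seed=0):
    """Assign survey respondents to areas so that each area's
    demographic cell counts match the census counts exactly.

    survey: list of dicts with hypothetical keys "id", "age_band", "sex".
    census_counts: {area: {(age_band, sex): number_of_residents}}.
    """
    rng = random.Random(seed)

    # Group respondents by demographic cell (the constraint variables).
    donors_by_cell = {}
    for person in survey:
        cell = (person["age_band"], person["sex"])
        donors_by_cell.setdefault(cell, []).append(person)

    synthetic = []
    for area, cells in census_counts.items():
        for cell, count in cells.items():
            donors = donors_by_cell.get(cell)
            if not donors:
                raise ValueError(f"no survey respondent matches cell {cell}")
            # Draw with replacement: one respondent can represent many residents.
            for _ in range(count):
                donor = rng.choice(donors)
                synthetic.append({**donor, "area": area})
    return synthetic


# Hypothetical toy inputs.
survey = [
    {"id": 1, "age_band": "16-34", "sex": "F"},
    {"id": 2, "age_band": "16-34", "sex": "F"},
    {"id": 3, "age_band": "35-64", "sex": "M"},
]
census_counts = {
    "area_A": {("16-34", "F"): 3, ("35-64", "M"): 1},
    "area_B": {("16-34", "F"): 1, ("35-64", "M"): 2},
}

population = build_synthetic_population(survey, census_counts)
area_cell_counts = Counter(
    (p["area"], p["age_band"], p["sex"]) for p in population
)
```

Because the draw is made within each cell, the demographic structure of every area matches the census by construction, while the same respondent may appear in several areas.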
Strengths The Synthetic Population is representative of the demographic characteristics of the respective area - down to a fine geographical resolution. The strength of the Synthetic Population is that it provides a wide range of information at the level of individuals. This information can be aggregated into groupings of interest (e.g. sex, income groups) and particular geographical units of interest (LSOA/DZ; MSOA; Local Authorities etc.). The method used to develop the synthetic population is referred to as spatial microsimulation. The synthetic population is used in conjunction with other models developed in SIPHER which enable insights into whether an intervention has benefitted a population group of interest. Small-area health indicators can be used to monitor area-level health inequalities or as inputs in statistical models. In addition, all health outcome measures can be attached to the Synthetic Population representing area-level health indicators. SIPHER reviewed the available measures and conducted a consensus process with SIPHER colleagues to agree on a final set of indicators. The criteria used were:
  1. Interpretability - accessible and meaningful to decision makers.
  2. Sensitivity to policy - the indicator can plausibly show the effects of policy.
  3. The indicator can show impacts of the pandemic on health.
  4. Timeliness - refers to the current health state.
  5. Availability of time series data.
  6. Changes in mental and physical health can be studied separately.
  7. Regular updates into the future are expected.
  8. Comparability - between areas, ideally comparable between England and Scotland.
  9. High resolution - available for small areas with LA as a minimum.
  10. Disaggregation - available by subgroups (e.g. broken down by age, sex etc.).
Its major strength is the wide range of potential applications; from descriptive analyses to studies examining the complex relationships between economic inclusion and health and wellbeing, at both individual and societal levels. The final data set will be made publicly available and can be shared without restrictions. The DCE data on relative preferences allow the calculation of equivalent income - a quantitative preference-based single metric of wellbeing - for any combination of SIPHER-7 indicators. The samples are large (ranging from 1000 to 3000, totalling just under 11,000) and representative of the UK general public in terms of age and sex. Public policies aim to improve wellbeing and to reduce wellbeing inequality. When there is a conflict between these, policy makers need to make difficult decisions. The quantitative data on inequality aversion is derived from discussion groups, where participants had the opportunity to examine the trade-off exercise in detail. The results help inform policy makers on the trade-offs between the two policy aims that members of the public would support. Different surveys have different health and wellbeing indicators, and this dataset allows the estimation of a statistical mapping algorithm between them. This would allow predicting SIPHER-7 information where the relevant variables are not available.
Limitations The accuracy of the Synthetic Population depends on the quality and availability of the underlying data. Some variables may have poor completion rates, resulting in missing data. Despite the high number of participants in the Understanding Society survey, explicit spatial constraints cannot be applied when creating the Synthetic Population. This means that an individual who was interviewed as part of the survey and who is actually residing in place X can be assigned to a variety of places A, B, and C, as long as they match the demographic constraints such as age, sex, marital status etc. Although recent updates of the code have led to more constraints on how to perform this selection process, it is important to remember that the Synthetic Population only provides associations and descriptive statistics. It can only ever serve as an approximation of the true UK population, which is likely to be more heterogeneous and diverse than the population captured in the survey. Therefore, it is important to acknowledge that the Synthetic Population is a synthetic dataset representing individuals who responded to a survey, and not a true register of individuals in a particular area. The data set cannot resolve situations where no data is available at all or where sampling in surveys is not representative of small geographical units. The data set is subject to a high level of geographical harmonisation as well as a thorough review process. However, for a few of the indicators, indicator definitions differ between countries (e.g., there are different definitions of fuel poverty in use in Scotland and England). In these cases, national deciles were created and comparable alternative indicators were identified (e.g., food insecurity as an alternative cost-of-living indicator). Currently not available. Currently not available. Currently not available.
Geography Individuals in the Synthetic Population have a geography associated with them (DZ/LSOA). This allows all levels of geography upwards from DZ/LSOA Level for Scotland, England and Wales - excluding Northern Ireland - to be explored with modelling and analyses. The exact geographical resolution is indicator-dependent. Typically, the following resolutions are available for Mortality: DZ/LSOA Level for Scotland, England and Wales and LA Level. Longitudinal (2017-2021) and geographically harmonised data is available at the level of local authorities (lower tier/district level) in England, Scotland, and Wales. The data set covers all 363 local authorities in their 2021 boundaries according to ONS definition. The surveys collected data from participants resident in the UK with sampling quotas for age and for sex. UK with sampling quotas for age and for sex. The survey collected data from participants resident in the UK with sampling quotas for age and for sex. Oversamples Scotland.
Variables / Indicators A large variety of variables are included. This includes all variables included in the Main Stage Survey of Understanding Society - the underlying survey data source. It also includes a variable, equivalent income, the calculation for which was developed by WS6. The data set includes measures of mortality, physical, and mental health, and composite measures combining mortality and health. It is open to data updates, and additional health indicators can be estimated and incorporated if required. Details on the individual indicator measures and broader domains are outlined in the Technical Report for the SIPHER Inclusive Economy Indicator Set – See Additional Resources. In addition to the DCE choice data, the surveys include participant self-reported data on: SIPHER-7; household size; age; gender; etc. Surveys (1) and (2) use the original SIPHER-7. Surveys (3) and (4) use the revised version of SIPHER-7. In addition to the inequality aversion task, the survey includes participant self-reported data on: SIPHER-7; household size; age; gender; etc. The indicator sets and questions included in the survey: SIPHER-7; ICECAP-A; EQ-5D-5L; SF-12 v2; HUI; WEMWBS; EQ-HWB; ONS-4; Understanding Society items on crime and housing; items from the Labour Force Survey, the Living Wage Foundation questionnaire; education, income, ethnicity, children, informal caregiving; gender, age; etc. Includes sampling weights to correct for age and sex with respect to the mid-year UK population estimate.
Time Period The latest release is for 2021. The latest census data are used as constraints for the spatial microsimulation - the process generating the Synthetic Population. For England and Wales this is the 2021 Census. For Scotland this is the 2011 Census with a 2021 mid-year estimate constraint to ensure that totals by age and sex are up to date. DZ/LSOA/MSOA Level: typically, cross-sectional representing the period covered by the synthetic population. Local Authority level: typically, longitudinal for 2004-2020 when based on non-synthetic data. Data will be updated as new data becomes available. Longitudinal data are available for every year between 2017 and 2022. There are differences between variables in the availability of historical data. There are four datasets: (1) people’s personal preferences in autumn 2020; (2) people’s personal preferences in autumn 2021; (3) people’s personal preferences in spring 2022; (4) people’s social preferences in spring 2022. Dataset (2) includes returning respondents from (1). Otherwise, the observations are independent. Data collected: summer - autumn 2022. Data collected: late 2022.
Missing Data The level of missing information for a particular variable is determined by the levels of missingness in the underlying Understanding Society main stage survey. Level of missing data is determined by data availability. Older data are not always comparable across time or form for some indicators. Missing data are imputed. In some cases, only cross-sectional measurements were available, which were rolled forward/backward - for example in cases where elections did not take place every single year. Currently not available. Currently not available. Currently not available.
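Rolling a measurement forward/backward, as mentioned for indicators that are not observed every year, can be sketched as below. This is one plausible reading of the procedure rather than the documented SIPHER implementation: gaps take the most recent earlier observation, and years before the first observation take that first observation. The function name and the example values are hypothetical.

```python
def roll_forward_backward(series):
    """Fill gaps (None) in a yearly series.

    Each gap takes the most recent earlier observation (roll forward);
    gaps before the first observation take that first value (roll backward).
    A series with no observations at all is returned unchanged.
    """
    filled = list(series)

    # Roll forward: carry the last observed value into later gaps.
    last_seen = None
    for i, value in enumerate(filled):
        if value is None:
            filled[i] = last_seen
        else:
            last_seen = value

    # Roll backward: fill any remaining leading gaps from the first observation.
    next_seen = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_seen
        else:
            next_seen = filled[i]
    return filled


# Hypothetical election turnout (%) for six consecutive years,
# observed only in the two years in which an election took place.
turnout = [None, 61.2, None, None, 58.9, None]
turnout_filled = roll_forward_backward(turnout)
```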
Examples / Link with Other Models and Data The Synthetic Population includes a derived variable ‘Equivalent Income’, which is calculated using the ‘Equivalent Income Calculator’ method developed by WS6. The Synthetic Population is used as the underlying data source in several SIPHER models. These include: (1) dynamical systems model (WS4), (2) static and dynamic microsimulation (WS3, WS5) and (3) decision support tool (WS7). Information covered in the Synthetic Population can be extended by adding additional variables from other data sources. These could be datasets that are not publicly available. A portfolio of area-level summary indicators on mortality, health, and composite indicators that combine information on mortality and health. These indicators can be attached as area-level indicators to the Synthetic Population. In addition, health measures are used in WS3 Local Authority clustering work, as well as in WS4 dynamical systems model. The data is currently used in a k-means clustering machine learning study. The primary aim of this study is to identify clusters of similar Local Authority Districts and the association of each cluster with a number of health outcomes. In another application, we currently examine the association between Quality-Adjusted Life Expectancy (QALE) and indicators of economic inclusion. The estimated parameters are used to calculate the equivalent income variable in the Synthetic Population. The estimated inequality aversion parameter is used to identify the optimal trade-off between maximising wellbeing and reducing inequality in the decision support tools. Currently not available.
Software Requirements Requires software that can handle the size of the data file, such as R or Python. Requires software that can handle the size of the data file, such as R or Python. Requires software that can load data, such as Excel, R, or Python. The main choice data and respondent background variables are saved in Stata format and require software that can read Stata files. The main trade-off data and respondent background variables are saved in Stata format and require software that can read Stata files. Currently saved in Stata format and requires software that can read Stata files.
Data Requirements / Restrictions To create the Synthetic Population, Understanding Society survey data and small-area census information are required. Understanding Society survey data can be downloaded from the website of UK Data Service, which requires an account and acceptance of the Understanding Society General License Agreement. Census data are available on the website of ONS/NRS. Information from both sources, survey and census data, needs to be combined via the Flexible Modelling Framework software to create the Synthetic Population. We are currently working on making pre-built versions of the Synthetic Population more easily available via the UK Data Service’s data sharing infrastructure. For key indicators such as QALE, Life Expectancy, and Lifespan Variation it is planned that a final version of the dataset and the underlying code will be made publicly available. In order to fully reproduce health measures requiring the Synthetic Population, access to the Synthetic Population is required. The final dataset will be made publicly available. A preliminary version can be shared upon request. Currently not available. Currently not available. Currently not available.
Data / Code Available Due to the underlying Understanding Society General License Agreement, the Synthetic Population dataset cannot be shared publicly as open access. However, we are planning to make the Synthetic Population more easily and more widely available via ReShare, the UK Data Service’s platform for sharing safeguarded data. This will allow interested users to work with a pre-built and validated Synthetic Population dataset. Work in progress, final dataset will be made publicly available. Pipeline of code for estimation of Quality-Adjusted Life Expectancy (QALE) is available. Final dataset will be made publicly available. Code to derive indicators is extensive and can be shared upon request. Currently not available. Currently not available. The dataset will be archived at University of Sheffield’s data repository, ORDA. There is no associated code.
Training We will share a tutorial on ReShare on how to work with a pre-built Synthetic Population. In addition, a step-by-step tutorial illustrating how to create this data source by combining census and survey data is currently pending and will be made publicly available once completed. Online pipeline example via GitHub. The data is accompanied by a manual/data dictionary which provides context to all included variables. Currently not available. Currently not available. Currently not available.
Additional Resources
  1. Paper describing applied static microsimulation to create the Synthetic Population: https://www.nature.com/articles/s41597-022-01124-9
  2. Understanding Society Survey website: https://www.understandingsociety.ac.uk/
  3. UK Data Service website: https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=6614#!/details
  4. Flexible Modelling Framework which is used to create synthetic populations: https://github.com/MassAtLeeds/FMF/releases
Health indicators report: https://sipher.ac.uk/wp-content/uploads/2022/01/SIPHER-Health-Indicators-Report-V1.3.pdf and QALE exemplar: https://github.com/AndreasxHoehn/QALE_Exemplar https://sipher.ac.uk/wp-content/uploads/2022/10/SIPHER-Inclusive-Economy-Indicator-set.pdf https://sipher.ac.uk/collapsing-multidimensional-wellbeing-into-equivalent-income/ Currently not available. Currently not available.



Compare All Quantitative Products

Characteristic Dynamical Systems Model Static Microsimulation Dynamic Microsimulation Decision Support Tool K-Means Clustering Small-Area Indicator Estimation
Status In Development Ready In Development In Development Ready Ready
Main Perspective Population Level (Macro) Individual Level (Micro) Individual Level (Micro) From Individual Level (Micro) to Population Level (Macro) Population Level (Macro) Population Level (Macro)
Purpose This state-space dynamical system model provides a simulation of how each variable contained in the systems map will be affected over time, given specific changes to one or more variables. All studied variables (unemployment, poverty, health, etc.) have to be represented by the input data. The model provides results at the local authority level and allows us to compare system-level effects of different (or no) policy interventions over time. This static microsimulation, using a digital twin of the UK population as a data source, provides a granular picture of the impact of policy interventions. This model enables us to examine changes relatively quickly and with a relatively low amount of computational resources. It achieves this by simplifying the relationships and interconnections of an individual’s attributes. This dynamic microsimulation, using longitudinal survey data such as the SIPHER Synthetic Population, provides a very granular picture of the impact of policy interventions on different population groups. This model uses individual-level data and simulates the transitions of individuals across different states (such as health states) over time, based on a specific set of models describing these transitions. The decision support tool is not a model in itself. Rather, it uses the available SIPHER models to provide decision support to policy analysts. K-means clustering is a data-driven approach that allows users to identify clusters of local authorities based on their performance with respect to the utilised inclusive economy data collection. This enables the identification of more or less inclusive clusters. In addition, the association between these clusters and a number of selected local authority level health outcomes is examined. The estimation of area-level indicators for small geographical units such as Local Authorities, MSOAs, or LSOAs is challenging. 
For example, fluctuations in the number of deaths can introduce imprecision and fluctuations when estimating life expectancy. Typically, these challenges increase as the size of the geographical unit decreases. Therefore, we employ a suite of specific small-area estimation methods to address these challenges. This suite of methods can then be applied to both non-synthetic and synthetic sources of data, such as the synthetic population, to obtain area-level estimates for the dimensions captured in the Understanding Society main stage survey.
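To illustrate why small death counts make area-level estimates unstable, and how borrowing strength from a larger population can dampen that, here is a minimal shrinkage sketch. It is a generic technique swapped in for illustration, not the suite of small-area estimation methods used by SIPHER; the `prior_strength` parameter, the area labels, and all numbers are hypothetical.

```python
def shrink_rates(deaths, population, prior_strength=1000.0):
    """Shrink each area's crude death rate toward the overall rate.

    deaths, population: {area: count}.
    prior_strength: hypothetical tuning value, expressed as the number of
    "pseudo-residents" at the overall rate added to every area. Small areas
    are pulled strongly toward the overall rate; large areas barely move.
    """
    overall_rate = sum(deaths.values()) / sum(population.values())
    return {
        area: (deaths[area] + prior_strength * overall_rate)
        / (population[area] + prior_strength)
        for area in deaths
    }


# Hypothetical inputs: a very small area with no recorded deaths
# and a large area with a stable crude rate of 1%.
deaths = {"small_area": 0, "large_area": 500}
population = {"small_area": 200, "large_area": 50_000}

overall_rate = sum(deaths.values()) / sum(population.values())
smoothed = shrink_rates(deaths, population)
```

In this toy case the small area's crude rate of zero, which is almost certainly an artefact of its size, is replaced by a value just below the overall rate, while the large area's estimate stays close to its crude rate.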
Strengths The model captures an entire system, including feedback loops to allow for the modelling of dynamic behaviour. In addition, the model allows the testing of policy changes ex-ante - rather than retrospectively. The model can capture both increases and decreases (such as increases or decreases in funding). A particular strength of the model is that it enables the examination of immediate outcomes at the level of individuals or households based on a policy change. Aggregating the outcomes allows a user to derive changes on the level of small geographies such as MSOA/LSOA, DZ, and local authorities. Models can provide immediate information on how many people will be affected, where those people live, and what their basic demographic characteristics are. Aggregation allows us to identify potential changes for specific geographical areas of interest. Designated longitudinal approach at the individual level, while outcomes can also be aggregated to reflect changes for population subgroups and geographical areas. Can search over many thousands of different intervention options (e.g. local communities, socio-demographic sub-groups, levels of intervention) to reveal trade-offs between outcomes. A summarising cluster solution clearly reduces complexity and leads to intuitive results. Outcomes have a straightforward meaning. Another strength of this approach lies in its ability to be updated and transferred to other sets of indicators or used over time. The suite of models aims to account for fluctuations and increase reliability of small-area estimates. This enables us to obtain reliable estimations given potentially unreliable data situations. The use of synthetic data can help to navigate situations in which no non-synthetic data would be available at all.
Limitations Any change to be modelled must be quantifiable by the model. This means that changes in variables which are not explicitly covered or for which there is no dependency will not become visible in the model. This implies that results are sensitive to pre-defined pathways which were specified in the systems map. Another limitation is posed by the assumption of known causal pathways between domains. This can be problematic in some cases and requires careful consideration and good justification. Furthermore, assumptions on the time frame for causal relationships need strong justification and supporting information, which might not always be available. Finally, all modelled policy interventions need to be attributable to the LA level. Limitations of the synthetic population apply. Interventions can be applied to specific variables, and outcomes applied to specific health variables. The decision support tool is dependent on SIPHER models and therefore subject to the limitations of these underlying models. Synthetic Population, Dynamical Systems Model and Dynamic Microsimulation can all be integrated but their limitations will then apply to the resulting decision support tool. It is important to note that the decision support tool is not intended to be used as a decision making tool. Rather the tool will provide a range of possible answers reflecting the trade-offs associated with potential decisions. The tool does not make any decisions - this responsibility rests with the user. In some cases, the achieved reduction in complexity might not be desired. It is a limitation that complete observations are required which often adds another preparatory step to the process (imputation of missing data). As a data-driven algorithm there are only limited options to intervene, for example with respect to the number of optimal clusters. Despite its advantages of dealing with small numbers, these methods cannot resolve situations where no data is available at all. 
The interpretation of results obtained from synthetic data needs care - for example, when interpreting very specific attributes for a very distinct geographical region.
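The k-means approach used for grouping local authorities, including its reliance on complete observations and a user-chosen number of clusters, can be made concrete with a small standard-library sketch. The indicator values below are hypothetical and assumed to be already scaled to a comparable range; this is a textbook implementation for illustration, not the code used in the SIPHER clustering study.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain k-means on tuples of complete (no missing) indicator values.

    Returns (labels, centroids). The number of clusters k is chosen by the user.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct observations
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical local authorities described by two scaled indicators.
indicators = [
    (0.10, 0.20), (0.15, 0.25), (0.20, 0.20),  # lower-scoring authorities
    (0.80, 0.90), (0.85, 0.80), (0.90, 0.85),  # higher-scoring authorities
]
labels, centroids = kmeans(indicators, k=2)
```

Cluster numbers themselves carry no meaning; only the grouping does, which is why any comparison with health outcomes is made per cluster after the fact.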
Geography Local Authority level for Scotland/England/Wales. LSOA/MSOA/DZ, and local authority level for Scotland, England, and Wales. DZ/LSOA Level for Scotland, England, and Wales. Adopts the same geographical perspective as the SIPHER models that have been integrated - typically it is matched to the needs of the policy partner (so we have created Sheffield, Greater Manchester, Scotland (and Scottish LA) versions of the tool). The clustering is currently based on all local authorities in Scotland, England, and Wales. A previous application covered the LSOA level for selected English Local Authorities. The most common geographical level reflects the Local Authority level for England, Scotland, and Wales. In addition, estimates can be derived for the MSOA Level in England and Wales. Deriving estimates for the Intermediate Zone Level in Scotland is currently in progress. Due to the use of synthetic data, even smaller geographical resolutions can be achieved for some indicators.
Time Period Based on available and imputed data for previous years (currently 2004-2021). The model provides a dynamic 5-year forecast up to 2026 for each variable in the model. Corresponds to the period covered in the underlying Synthetic Population, for example based on Understanding Society wave k (2019-2021). The ‘jump off’ point for the scenarios is the latest period in the underlying Understanding Society input data (currently wave k (2019-2021)). The ‘time horizon’ for the scenario is set at 2037. Adopts the same time period as the SIPHER models that have been integrated. Corresponds to the period covered in the underlying Synthetic Population, for example based on Understanding Society wave k (2019-2021) up to 2025/2026. The current approach is cross-sectional, covering the last available year (2020/2021). As data on inclusive economies is available for a much longer period, it is planned to study the stability of clusters over time. Estimates are available for 2004/2014 to 2020/2021 - dependent on indicator and underlying data sources. Data updates and suggestions of new indicators can be incorporated easily.
Adjustments / Extensions Factors which can be modified include: the underlying systems map (representing domains and their interactions), features of each respective intervention (including the amount of uplift or characteristics of recipients). In addition, the method can be used to capture different systems (environment, housing etc.). All information describing individuals in all or only particular areas can be seen as potentially modifiable. For example, income, employment status, health etc. These interventions are typically informed by previous research and are often referred to as “the morning after” scenarios - situations in which a change to one or more individual-level factors has occurred instantaneously. Features of each respective intervention, including the amount of uplift or characteristics of recipients receiving the uplift. Potential adjustments include characteristics of the underlying models as well as features and the geographical granularity of the reported outcomes. Adjustments to the current model include the number of clusters, a designated focus on one or more UK Nations (Scotland, England or Wales) in isolation as well as the respective Inclusive Economy indicators and health outcomes considered. Data updates can be incorporated easily. Ideas for additional indicators are welcome and can be estimated provided that suitable data is available in a synthetic or non-synthetic source.
Data Requirements Aggregate level inputs for units of the studied geographical level (e.g. unemployment rate for the LA). Sufficient longitudinal data is required for all variables to validate the model. Cross-sectional data are of limited use but can be rolled out longitudinally. Domain-specific definitions need to be similar across all geographical units. Synthetic Population (see Product Guide details) Understanding Society (waves a-k). If spatial results are required, the latest version of the Synthetic Population (see data for details). The decision support tool requires results from other SIPHER models. In addition, information on the intervention as well as cost-effectiveness assumptions are required. Aggregate-level information for geographical areas on a selected set of indicators. Indicators can come from various different sources, but each indicator must have been measured consistently across observation units. For k-means to work properly, the level of missing information should be 0%. In case any information is missing, imputation methods can be utilised to achieve this requirement. This is indicator dependent. For some indicators, all required data is free and publicly available via ONS/NRS vital statistics data on population, deaths, and health outcomes. In particular for those indicators combining mortality and health information (e.g., QALE) access to the General and Special License of Understanding Society is required - depending on the level of geography required. If the underlying data source is synthetic data, such as the synthetic population, requirements of this source apply.
Applications Typical applications include studying a system's behaviour in response to policy interventions, such as interventions addressing poverty, living wage, participation in employment, skills and qualifications. In addition, this set of models can help to answer questions about the potential impact of direct policy responses to the current cost-of-living crisis. Number and characteristics of people affected by a financial uplift policy or labour market intervention as well as total costs of this policy for a particular geographical area. Shocks and policy interventions which can be expressed as changes at the individual level. For example: changes to disposable income. Transition models need to be constructed for new problems. Applications include local community interventions on components of wellbeing; spatial targeting of job creation schemes; impact of targeted employment stimuli on health outcomes. The method is currently used to cluster local authorities based on inclusive economy indicators. It can be expanded to other indicator sets and domains as well as other outcome measures (environmental indicators). Estimated measures include measures of mortality such as life expectancy and lifespan variation, measures of health such as the SF-12 instrument capturing physical and mental health, and composite measures combining health and mortality. Measures at the household level related to cost-of-living are also available and can be obtained from synthetic sources.
Modelling Assumptions Models depend on a pre-defined systems map that describes how domains impact each other and which domains can be subject to interventions. These systems maps need to specify causal pathways between domains with pre-defined time lags. Models also depend on data to provide evidence for quantifying relationships. Assumptions of the Synthetic Population apply. The model relies on the assumption that transitions between states over time - representing the characteristics of an individual - can be modelled using a set of specified and measured characteristics of this individual. In addition, the Markov assumption needs to hold, meaning that the time spent in a particular state (e.g. unemployed) does not have an impact on the probability of transitioning into other states (e.g. employed). Inherits the assumptions of the SIPHER models that have been integrated. In addition, assumptions on the costs and effectiveness of interventions are required. Clusters are identified based on the similarity of observed units with respect to a number of defined domains. The major assumption is that small population sizes require specific methods to account for random fluctuations due to small numbers. Many measures, such as mortality rates, follow a very distinct pattern over age (standard trajectory), which requires knowledge of this approximate standard trajectory. When synthetic data is used, assumptions of the synthetic population apply.
User Options Which variable to change and by how much, corresponding to the policy intervention (or shock/absence of intervention) which is evaluated. Character, target group, and magnitude of particular interventions. In addition, the user can choose the geography level and select specific geographical regions of interest. Character, target group, and magnitude of particular interventions. In addition, users can assess the impacts for LSOAs/DZs within a given area. Geographical and temporal focus. Intervention configuration options. The primary option for adjustment is the number of clusters. The most common options are the measure itself, the geographical resolution, and year.
User Type(s) Modellers, decision makers Modellers, decision makers, descriptive overview to inform statistical modelling Modellers, decision makers Modellers, decision makers Provides descriptive overview to inform decision making and modelling Outcomes are used as inputs in other models, for monitoring purposes, and can inform decision making.
Examples / Link with Other Models and Data Models of dynamic systems can inform individual-level approaches and help to validate results which were obtained in individual-level approaches. This also works in the opposite direction: changes at the individual level can be aggregated and expressed at LA level. This model requires SIPHER’s Synthetic Population. This model uses SIPHER’s Synthetic Population. The decision support tool uses the synthetic population, the systems dynamic model, the static and dynamic microsimulations, and the equivalent income utility function. Can inform the interpretation of WS4 models. In turn, can inform WS4 model input. Some of the derived health measures are used as input data in WS4 models, as outcomes for the association of clusters with health outcomes. In addition, some derived health measures can be attached to the synthetic population to represent area-level features as they cannot be derived directly from the synthetic population.
Software Requirement(s) Matlab. R or Python Python Python R R
Options for Extension Building different models for different systems. Modelling and quantifying uncertainty. All results can be combined with cost information where available to conduct cost-benefit analyses. Building different models for different interventions. Factors impacting transitions can be adjusted based on different contexts and assumptions. Alternative policy/intervention configurations. Other domains for which indicator sets exist or can be created (crime, transport, environment etc.). A k-means clustering approach can be applied to individual-level life course trajectories. Extension to a variety of small-area indicators is possible, such as age trajectories of fertility rates, employment rates, emergency admissions etc. In addition, different synthetic data sources can be utilised to create synthetic populations.
Links to Output and Examples In preparation, preliminary results available upon request. Paper describing applied static microsimulation to create the Synthetic Population: https://www.nature.com/articles/s41597-022-01124-9 For documentation visit: https://leeds-mrg.github.io/Minos/ and for code and more detailed user instructions visit: https://github.com/Leeds-MRG/Minos Software available here: https://ligerdev.shef.ac.uk/sipher-team/sipher_ws7_interventions and database here: https://ligerdev.shef.ac.uk/sipher-team/sipher_ws7_database Preliminary results available upon request An exemplary pipeline, estimating a range of health measures: https://github.com/AndreasxHoehn/QALE_Exemplar
Contact
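
The Markov assumption named under Modelling Assumptions can be illustrated with a minimal sketch. This is not SIPHER's or Minos's implementation: the two states, the transition probabilities, and the function names `next_state` and `simulate` are hypothetical and chosen only for illustration. The point it shows is that the next state is drawn using the current state alone, so the time already spent in a state has no influence.

```python
import random

# Hypothetical annual transition probabilities (illustrative values only,
# not estimates from any SIPHER model).
TRANSITIONS = {
    "employed": {"employed": 0.9, "unemployed": 0.1},
    "unemployed": {"employed": 0.3, "unemployed": 0.7},
}


def next_state(current, rng):
    """Draw the next state from the current state only (Markov assumption)."""
    states = list(TRANSITIONS[current])
    weights = [TRANSITIONS[current][s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]


def simulate(start, years, seed=0):
    """Simulate one individual's state path for a number of yearly steps."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(years):
        # Only path[-1] is used: duration in the state is ignored.
        path.append(next_state(path[-1], rng))
    return path


if __name__ == "__main__":
    print(simulate("unemployed", years=5))
```

If duration in a state did matter (for example, long-term unemployment lowering the chance of re-employment), the transition probabilities would have to depend on more than the current state, which is exactly what the Markov assumption rules out.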


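
The data requirement for the clustering product (complete observations, with imputation where values are missing, then k-means) can be sketched as follows. This is a minimal illustration under stated assumptions, not the SIPHER pipeline: the document lists R as the software for this product, mean imputation is only one of several possible imputation methods, and the helper names `impute_column_means` and `kmeans` as well as the toy indicator values are hypothetical.

```python
import math
import random


def impute_column_means(rows):
    """Replace missing values (None) with the mean of the observed column values."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means: assign each point to its nearest centroid, then update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members)
                                      for col in zip(*members)])
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


if __name__ == "__main__":
    # Toy area-level indicators; None marks a missing value.
    areas = [[1.0, 2.0], [1.2, None], [0.9, 2.1],
             [8.0, 9.0], [8.2, 9.1], [None, 8.9]]
    complete = impute_column_means(areas)
    labels, centroids = kmeans(complete, k=2)
    print(labels)
```

The number of clusters `k` is the main user-facing choice here, matching the user option described for this product; in practice indicators would usually also be standardised before clustering so that no single indicator dominates the distance.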